Abstract. This paper describes a new feature set for use in the recognition
of on-line handwritten Devanagari script based on Fuzzy Directional
Features. Experiments are conducted for the automatic recognition of
isolated handwritten character primitives (sub-character units). We first
describe the proposed feature set, called the Fuzzy Directional Features
(FDF), and then show how these features can be effectively utilized
for writer independent character recognition. Experimental results show
that the FDF set performs well on a writer independent data set at stroke
level recognition. The main contribution of this paper is the introduction of
a novel feature set and the experimental demonstration of its ability to
recognize handwritten Devanagari script.


1 Introduction 

Interest in on-line handwritten script recognition has been active for a long time.
In the case of Indian languages, research work is active especially for Devanagari
[6,9], Bangla [10,3], Telugu [1] and Tamil [12,2], to name a few. In English
script, the most widely researched, all but a few letters can be written in a
single stroke. But most of the Indian languages have characters which are made
up of two or more strokes, which makes it necessary to analyze a set of strokes
to identify the entire character. We identified, through visual inspection of
the script, a basis-like set of 44 strokes¹, called primitives, which are
sufficient to represent all the characters in Devanagari script. The set of
primitives used to write the complete Devanagari character set is shown in
Figure 1(a). In this paper we use these primitives as the units for recognition.
In unconstrained handwriting these primitive strokes exhibit large variability
in shape, direction and order of writing. A sample set of primitives collected
from different writers is shown in Figure 1(b) to capture the variability in the
way primitives are written. The variation within the primitives even for the
same writer is evident, and it is observed that the variation among different
writers is even larger, making the task of recognizing these primitives
difficult.

¹ Usually the segment of pen motion from the pen-down to the pen-up position is a
loose definition of a stroke.
Fig. 1. (a) Primitive handwritten strokes that can be used to write the complete
alphabet set in Devanagari. (b) Wide variability is observed in writing primitives.


The main challenges in on-line handwritten character recognition in Indian
languages are the large size of the character set, variation in writing style
(when the same stroke is written by different writers or the same writer at
different times) and the similarity between different characters in the script.

In this paper, we propose the use of Fuzzy Directional Features (FDF) set 
for the recognition of the primitives (which are also strokes). The variations that 
exist in the primitives (see Figure 1(b)) test the ability of the proposed features 
to recognize handwritten script. The rest of the paper is organized as follows. We 
introduce the Fuzzy Directional Features set in Section 2. Experimental results 
are outlined in Section 3, and conclusions are drawn in Section 4. 

2 Fuzzy Directional Features Extraction 

Several temporal features have been used for script recognition in general
[7,5,11,4] and for on-line Devanagari script recognition in particular. We
propose a simple yet effective feature set based on fuzzy directional
features². The detailed procedure for obtaining these directional features is
given below.

Let an on-line handwritten character be represented by a variable number
of 2D points which are in a time sequence. For example, an on-line script would
be represented as

$$\{(x_{t_1}, y_{t_1}), (x_{t_2}, y_{t_2}), \ldots, (x_{t_n}, y_{t_n})\}$$

where t denotes the time and $t_1 < t_2 < \cdots < t_n$. Equivalently, we can
represent the on-line character (see Figure 2(a), 2(c)) as

$$\{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$$

by dropping the variable t. The number of points, denoted by n, varies depending
on the size of the character and also the speed with which the character is
written. Most script digitizing devices (popularly called electronic pens) sample
the script uniformly in time, generally at 100 Hz. For this reason, the number
of sampling points is large when the writing speed is slow, which is especially
true at curvatures (see Figure 2(a), 2(c)); we exploit these curvature points in
extracting FDF.

² Note that [8] talks of a fuzzy feature set for Devanagari script, albeit for
offline script.





Fig. 2. Sample character (a, c) and its curvature points (b, d). 


We first identify the curvature points (called critical points) from the smoothed
(we use discrete wavelet transform) handwriting data. The sequence
$\{(x_i, y_i)\}_{i=1}^{n}$ represents the handwriting data of a stroke. We treat
the sequences $x_i$ and $y_i$ separately and calculate the critical points for
each of these time sequences. For the x sequence, we calculate the first
difference $x'_i = \mathrm{sgn}(x_i - x_{i+1})$, where $\mathrm{sgn}(\beta) = +1$
if $\beta > 0$, $\mathrm{sgn}(\beta) = -1$ if $\beta < 0$ and
$\mathrm{sgn}(\beta) = 0$ if $\beta = 0$. We use $x'$ to compute the critical
points: the point i is a critical point if $x'_i - x'_{i+1} \neq 0$. Similarly, we
calculate the critical points for the y sequence. The final list of critical
points is the union of all the points marked as critical points by either the x
or the y sequence (see Figure 2(b), 2(d)). It must be noted that the position and
number of curvature points computed for different samples of the same stroke
vary. Trimming of curvature points is carried out on the obtained k − 1 direction
sequence by removing all spurious curvature points. A curvature point is said to
be spurious if a set of three curvature points results in the same direction.
For the sake of discussion, let us assume that there are no spurious curvature
points.
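
The critical point rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in the experiments: the function names `sgn` and `critical_points` are hypothetical, indices are 0-based, and the input is assumed to be already smoothed. Trimming of spurious points and any special handling of the stroke end points are not covered.

```python
def sgn(v):
    """Sign function: +1 for v > 0, -1 for v < 0, 0 for v == 0."""
    return (v > 0) - (v < 0)


def critical_points(xs, ys):
    """Return the sorted indices marked critical by the x or the y sequence."""

    def critical_1d(seq):
        # First difference sign: d[i] = sgn(seq[i] - seq[i + 1]).
        d = [sgn(a - b) for a, b in zip(seq, seq[1:])]
        # Point i is critical when d[i] - d[i + 1] != 0 (rule taken literally).
        return {i for i in range(len(d) - 1) if d[i] != d[i + 1]}

    # Union of the critical points found on the two coordinate sequences.
    return sorted(critical_1d(xs) | critical_1d(ys))
```

For a stroke that moves right and then turns back, the x sequence changes its difference sign once, so a single critical point is reported near the turn.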

Let k be the number of curvature points (denoted by $c_1, c_2, \ldots, c_k$)
extracted from a stroke of length n; usually $k \ll n$. The k critical points
form the basis for extraction of the FDF. We first compute the angle between two
curvature points, say $c_l$ and $c_m$, as

$$\theta_{lm} = \tan^{-1}\left(\frac{y_l - y_m}{x_l - x_m}\right)$$

where $(x_l, y_l)$ and $(x_m, y_m)$ are the coordinates corresponding to the
curvature points $c_l$ and $c_m$ respectively. The FDF set is computed using
$\theta_{lm}$. We use Algorithm 1 (assisted by the triangular membership
function, Algorithm 2) to compute the FDF. Note that every $\theta_{lm}$
(represented by θ in Figure 3(a); it is the angle the blue dotted line makes
with the 0° axis) has two directions (say $d^1_{lm} = 1$, $d^2_{lm} = 2$; note
that the dotted blue line in Figure 3(a) lies in both the triangles represented
by direction 1 and direction 2) associated with it, having membership values
$m^1_{lm}$ and $m^2_{lm}$ respectively (represented by the green and the red dot
respectively in Figure 3(a)). Also note that (a) $m^1_{lm} + m^2_{lm} = 1$ and
(b) $d^1_{lm}$ and $d^2_{lm}$ are adjacent directions; for example, if
$d^1_{lm} = 5$ then $d^2_{lm}$ could be either 4 or 6.
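
As a small illustration of the angle computation, the sketch below returns the k − 1 angles between consecutive curvature points. Two assumptions are made that the text does not state: the pairs $(c_l, c_m)$ are taken to be consecutive curvature points, and `math.atan2` is used in place of a plain arctangent so that the angle covers the full circle and follows the direction of pen motion from $c_l$ to $c_m$. The name `segment_angles` is hypothetical.

```python
import math


def segment_angles(points):
    """Angles in radians, in (-pi, pi], between consecutive curvature points.

    points: list of (x, y) tuples, one per curvature point, in time order.
    """
    # atan2 keeps the quadrant, which a plain arctangent of the ratio loses
    # (assumption: direction is measured from the earlier to the later point).
    return [
        math.atan2(ym - yl, xm - xl)
        for (xl, yl), (xm, ym) in zip(points, points[1:])
    ]
```

With k curvature points this yields k − 1 angles, matching the length of the direction sequence mentioned earlier.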



Fig. 3. Fuzzy Directional Features. (a) θ contributing to two directions (1, 2)
with corresponding membership values (green and red dots). (b) Fuzzy Directional
Features.


Algorithm 1 Computing Fuzzy Directional Features
deg2fuzzydir(double θ)
    i = 1; d[i] = -1; m[i] = -1
    if (θ > -π/4 & θ < π/4) then
        d[i] = 1; m[i] = fuzzy_membership(0, θ);
        i++;
    end if
    if (θ > 0 & θ < 2π/4) then
        d[i] = 2; m[i] = fuzzy_membership(π/4, θ);
        i++;
    end if
    {Similarly for d[i] = 3, 4, 5, 6, 7}
    if (θ > -2π/4 & θ < 0) then
        d[i] = 8; m[i] = fuzzy_membership(-π/4, θ);
        i++;
    end if
    return(d, m);
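
A compact Python sketch of Algorithm 1 together with the triangular membership function is given below. It is a sketch under stated assumptions rather than the authors' code: the eight direction centres are assumed to lie at multiples of π/4 starting from 0, the input angle is assumed to be in radians, and the angular difference is wrapped so that directions on either side of ±π remain adjacent. The constant `QUARTER_PI` is a hypothetical helper name.

```python
import math

QUARTER_PI = math.pi / 4  # assumed spacing between the 8 direction centres


def fuzzy_membership(theta_c, theta):
    """Triangular membership: 1 at theta_c, falling to 0 at theta_c +/- pi/4."""
    # Wrap the difference into [-pi, pi) so directions near +/-pi stay adjacent.
    diff = (theta - theta_c + math.pi) % (2 * math.pi) - math.pi
    return max(0.0, 1.0 - abs(diff) / QUARTER_PI)


def deg2fuzzydir(theta):
    """Return [(direction, membership), ...] for directions with membership > 0."""
    result = []
    for d in range(1, 9):
        centre = (d - 1) * QUARTER_PI  # assumed centres: 0, pi/4, ..., 7*pi/4
        m = fuzzy_membership(centre, theta)
        if m > 0.0:
            result.append((d, m))
    return result
```

An angle halfway between two centres, such as π/8, contributes equally to directions 1 and 2, and in general the memberships of the two adjacent directions add up to 1.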


Algorithm 2 Triangular Fuzzy Membership Function
fuzzy_membership(θc, θ)
    m = 1.0 − |θ − θc| / (π/4);
    return(m);

It should be noted that the sum of the membership values of a particular
row (see Figure 3(b)) is always 1. Given an on-line character, we extract the FDF
shown in Figure 3(b). Then we calculate the mean FDF by averaging across the
columns, so as to form a vector of dimension 8. The mean is calculated as follows:
for each direction (1 to 8), collect all the membership values and divide by the
number of occurrences of the membership values in that direction. For example,
for Figure 3(b), the mean for direction 1 is calculated as
$f_1 = (m_{23} + m_{34})/2$. In our experiments we have used this mean FDF

$$F = (f_1, f_2, \ldots, f_8) \qquad (1)$$

to represent a stroke.
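
The averaging step can be illustrated with the following sketch, which takes one list of (direction, membership) pairs per pair of curvature points and returns the 8-dimensional mean FDF. The function name `mean_fdf` is hypothetical, and assigning 0.0 to a direction that never occurs is an assumption, since the text does not say how such directions are treated.

```python
def mean_fdf(rows):
    """Mean membership per direction.

    rows: one list of (direction, membership) pairs per segment,
          with directions numbered 1 to 8.
    """
    totals = [0.0] * 8
    counts = [0] * 8
    for row in rows:
        for d, m in row:
            totals[d - 1] += m
            counts[d - 1] += 1
    # Divide by the number of occurrences in each direction; directions that
    # never occur are set to 0.0 (assumption, not stated in the text).
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]
```

For instance, if direction 1 receives memberships 0.5 and 0.25 from two segments, its mean is 0.375, regardless of how many segments the stroke has in total.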

3 Experimental Analysis 

For experimental analysis, we collected handwritten data from 10 persons, each
of whom wrote all the primitives of Devanagari text using a Mobile e-Notes Taker³.
This raw stroke data is smoothed using Discrete Wavelet Transform (DWT)
decomposition⁴ to remove noise in the form of small undulations due to the
sensitivity of the sensors on the electronic pen. For each stroke we extracted
the fuzzy directional feature set as described in Section 2. We used the data of
5 users for training and that of the other 5 for testing the performance of the
FDF set. We initially hand-tagged each stroke in the collected data using the 44
primitives that we selected (see Figure 1(a)).

For training, we calculated (1) for all strokes corresponding to the same
primitive and computed the average to model the primitive. So a primitive was
represented by a vector of size 8 by taking the average over all the occurrences
of the primitive in the training set. All the experimental results are based on
this data set (from 10 different writers).

For testing, we took a stroke t to be recognized, extracted its FDF (using
Algorithm 1) and computed the mean FDF using (1). Then we compared it with the
FDF models of the 44 reference strokes using the usual Euclidean distance
measure. We computed, for the test stroke t, its distance from all the
primitives, namely $\|F_t - F_i\|$ for $i = 1, \ldots, 44$, and arranged them in
increasing order of magnitude (best match first). The results are shown in
Table 1 for both the train data and the test data for α = 1, 2, 5. Note that the
values in Table 1 are computed as follows. For N = α, the test stroke t is
recognized as the primitive l if l occurs within the first α positions from the
best match (this is generally called the N-best in literature). As expected, the
recognition accuracies are poor (very similar to phoneme recognition by a speech
engine) for α = 1 and improve with increasing α. It should be noted that the
accuracies are writer independent and for stroke level recognition.

³ http://www.hitech-in.com/mobile_e-note_taker.htm

⁴ We do not dwell on this since this is well covered in pattern recognition literature.
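
The training and N-best evaluation procedure described in this section can be sketched as a nearest-mean classifier. The sketch below is illustrative only: the function names are hypothetical, the feature vectors are assumed to be mean FDF vectors of equal length, and ties in distance are broken by the order in which labels were first seen.

```python
import math


def train_models(samples):
    """samples: iterable of (label, feature_vector). Returns {label: mean vector}."""
    sums, counts = {}, {}
    for label, vec in samples:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for j, v in enumerate(vec):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def n_best(models, vec, n):
    """Labels of the n models closest to vec in Euclidean distance, best first."""
    ranked = sorted(models, key=lambda label: math.dist(models[label], vec))
    return ranked[:n]


def n_best_accuracy(models, test_samples, n):
    """Fraction of test samples whose true label is among the n best matches."""
    hits = sum(label in n_best(models, vec, n) for label, vec in test_samples)
    return hits / len(test_samples)
```

Calling `n_best_accuracy` with n = 1, 2 and 5 on held-out writers would give figures of the kind reported for α = 1, 2, 5; by construction the accuracy cannot decrease as n grows.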



Table 1. Recognition accuracies for train and test data set.

Data         α = 1             α = 2             α = 5
Train Data   63.0% (139/220)   87.9% (193/220)   93.3% (205/220)
Test Data    37.0% (82/220)    54.6% (120/220)   78.1% (172/220)


4 Conclusions 

In this paper we have introduced a new on-line script feature set, called the
Fuzzy Directional Features. We have evaluated the performance of the novel
feature set by presenting recognition accuracies for a writer independent stroke
level data set. It is well known, both in speech and script recognition
literature, that stroke (phoneme in the case of speech) recognition is always
poor. As in speech, we plan to (a) use Viterbi traceback to enhance alphabet
(multiple stroke) recognition and/or (b) cluster strokes using spatio-temporal
information to form alphabets and then use the cluster of strokes to recognize
them. This, we believe, will lead to better accuracies of writer independent
script recognition.